Discovering Synonyms And Other Related Words

Authors

  • Krister Linden
  • Jussi Piitulainen
Abstract

Discovering synonyms and other related words among the words in a document collection can be seen as a clustering problem, where we expect the words in a cluster to be closely related to one another. The intuition is that words occurring in similar contexts tend to convey similar meaning. We introduce a way to use translation dictionaries for several languages to evaluate the rate of synonymy found in the word clusters. We also apply the information radius to calculating similarities between words using a full dependency syntactic feature space, and introduce a method for similarity recalculation during clustering as a fast approximation of the high-dimensional feature space. Finally, we show that 69-79% of the words in the clusters we discover are useful for thesaurus construction.
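
The information radius named in the abstract is commonly defined as the sum of the Kullback-Leibler divergences of two distributions from their average. The sketch below assumes that standard definition and compares two words by their normalized context-feature counts. The feature counts and the function names are hypothetical, and the abstract does not say how the authors weight or select their dependency features, so this is an illustration rather than their implementation.

```python
import math


def normalize(counts):
    """Turn raw feature counts into a probability distribution."""
    total = sum(counts)
    return [c / total for c in counts]


def kl_divergence(p, q):
    """D(p || q) in bits; terms where p is zero contribute nothing."""
    return sum(pi * math.log2(pi / qi) for pi, qi in zip(p, q) if pi > 0)


def information_radius(p, q):
    """IRad(p, q) = D(p || m) + D(q || m), where m is the average of p and q.

    With base-2 logarithms the value lies between 0 (identical
    distributions) and 2 (distributions with no shared features).
    """
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return kl_divergence(p, m) + kl_divergence(q, m)


# Hypothetical counts of how often each word occurs with four
# dependency features (for example, as the object of a given verb).
car = normalize([8, 1, 1, 0])
automobile = normalize([7, 2, 1, 0])
banana = normalize([0, 1, 1, 8])

print(information_radius(car, automobile))  # small: similar contexts
print(information_radius(car, banana))      # larger: different contexts
```

A lower value means the two words occur in more similar contexts. Because the average m is nonzero wherever either input is nonzero, the measure stays finite even when one word has never been seen with a feature.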

Similar resources

Application of Frame Semantics to Teaching Seeing and Hearing Vocabulary to Iranian EFL Learners

A term in one language rarely has an absolute synonymous meaning in the same language; besides, it rarely has an equivalent meaning in an L2. English synonyms of seeing and hearing are particularly grammatically and semantically different. Frame semantics is a good tool for discovering differences between synonymous words in L2 and differences between supposed L1 and L2 equivalents. Vocabulary ...

Word Type Effects on L2 Word Retrieval and Learning: Homonym versus Synonym Vocabulary Instruction

The purpose of this study was twofold: (a) to assess the retention of two word types (synonyms and homonyms) in the short term memory, and (b) to investigate the effect of these word types on word learning by asking learners to learn their Persian meanings. A total of 73 Iranian language learners studying English translation participated in the study. For the first purpose, 36 freshmen from an ...

Finding Synonyms Using Automatic Word Alignment and Measures of Distributional Similarity

There have been many proposals to extract semantically related words using measures of distributional similarity, but these typically are not able to distinguish between synonyms and other types of semantically related words such as antonyms, (co)hyponyms and hypernyms. We present a method based on automatic word alignment of parallel corpora consisting of documents translated into multiple lan...

Word Sense Disambiguation for Information Retrieval

Motivation: Documents related to an IR query sometimes contain only the synonyms of the query words instead of the query words themselves. A simple IR system with no knowledge of synonyms fails to recognize the relevance of these documents to the query. So, we can improve recall of IR systems by considering the synonyms of the query words as a part of the IR query. However, only relevant synony...

Getting Synonym Candidates from Raw Data in the English Lexical Substitution Task

Distributional similarity provides a technique for obtaining semantically related words from corpus data using automated methods that compare the contexts in which the words appear. Such methods can be useful for producing thesauruses, with application to work in lexicography and computational linguistics. However, the most similar words produced using these methods are not always near synonyms...

Publication date: 2004